Word Sense Disambiguation: A Unified Evaluation Framework and Empirical Comparison
Abstract
Word Sense Disambiguation is a longstanding task in Natural Language Processing, lying at the core of human language understanding. However, the evaluation of automatic systems has been problematic, mainly due to the lack of a reliable evaluation framework. In this paper we develop a unified evaluation framework and analyze the performance of various Word Sense Disambiguation systems in a fair setup. The results show that supervised systems clearly outperform knowledge-based models. Among the supervised systems, a linear classifier trained on conventional local features still proves to be a hard baseline to beat. Nonetheless, recent approaches exploiting neural networks on unlabeled corpora achieve promising results, surpassing this hard baseline in most test sets.
Related resources
A Unified Multilingual Semantic Representation of Concepts
Semantic representation lies at the core of several applications in Natural Language Processing. However, most existing semantic representation techniques cannot be used effectively for the representation of individual word senses. We put forward a novel multilingual concept representation, called MUFFIN, which not only enables accurate representation of word senses in different languages, but ...
A Large-Scale Pseudoword-Based Evaluation Framework for State-of-the-Art Word Sense Disambiguation
The evaluation of several tasks in lexical semantics is often limited by the lack of large numbers of manual annotations, not only for training purposes, but also for testing purposes. Word Sense Disambiguation (WSD) is a case in point, as hand-labeled data sets are particularly hard and time-consuming to create. Consequently, evaluations tend to be performed on a small scale, which does not al...
Worst-case complexity and empirical evaluation of artificial intelligence methods for unsupervised word sense disambiguation
Word Sense Disambiguation (WSD) is a difficult problem for NLP. Algorithms that aim to solve the problem focus on the quality of the disambiguation alone and require considerable computational time. In this article we focus on the study of three unsupervised stochastic algorithms for WSD: a Genetic Algorithm (GA) and a Simulated Annealing algorithm (SA) from the state of the art and our own Ant ...
Quality Assessment of Large Scale Knowledge Resources
This paper presents an empirical evaluation of the quality of publicly available large-scale knowledge resources. The study includes a wide range of manually and automatically derived large-scale knowledge resources. In order to establish a fair and neutral comparison, the quality of each knowledge resource is indirectly evaluated using the same method on a Word Sense Disambiguation task. The e...
Latent Semantic Word Sense Induction and Disambiguation
In this paper, we present a unified model for the automatic induction of word senses from text, and the subsequent disambiguation of particular word instances using the automatically extracted sense inventory. The induction step and the disambiguation step are based on the same principle: words and contexts are mapped to a limited number of topical dimensions in a latent semantic word space. Th...
Publication date: 2017